Abstract

The top-quark pair production cross section in 7 TeV centre-of-mass proton-proton
collisions is measured using data collected by the CMS detector at the LHC. The
measurement uses events with one jet identified as a hadronically-decaying $\tau$ lepton and
at least four additional energetic jets, at least one of which is identified as coming
from a b quark. The analysed data sample corresponds to an integrated luminosity
of 3.9 fb$^{-1}$ recorded by a dedicated multijet plus hadronically-decaying $\tau$ trigger. A
neural network has been developed to separate the top-quark pairs from the W+jets
and multijet backgrounds. The measured value of $\sigma_{t\bar{t}}$ = 152 ± 12 (stat.) ± 32 (syst.) ±
3 (lum.) pb is consistent with the standard model predictions.

1 Introduction

Top-quark pairs ($t\bar{t}$) are copiously produced at the Large Hadron Collider (LHC), primarily
through gluon-gluon fusion. Measurements of the $t\bar{t}$ production cross section and branching
fractions are important tests of the standard model (SM), since the top quark is expected to
play a special role in various extensions of the SM due to its high mass (see for example [1, 2]).

The branching fraction of a top-quark decay to a W boson and a b quark is close to 100% in the
SM. The final states of top-quark decays are therefore determined by the decay modes of the W
bosons. In this Letter, top-quark decays in the "hadronic $\tau$ + jets" final state are studied. One W
boson decays into a hadronically-decaying $\tau$ lepton ($\tau_h$) and a neutrino, with a branching
fraction of 0.1125 × 0.647 [3], and the other decays to a quark-antiquark pair, with a branching
fraction of 0.676 [3]. Thus, 9.8% of the $t\bar{t}$ pairs produced are expected to lead to this final state.
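
The 9.8% figure can be checked from the branching fractions quoted above. The short Python sketch below assumes a factor of two because either of the two W bosons may be the one that decays to the $\tau$ lepton; that factor is our reading of the text, and the names are illustrative only.

```python
# Branching fractions quoted in the text (Ref. [3]).
BR_W_TO_TAU_NU = 0.1125     # W -> tau nu
BR_TAU_TO_HADRONS = 0.647   # tau -> hadrons + nu
BR_W_TO_QQ = 0.676          # W -> quark-antiquark pair


def tau_h_plus_jets_fraction() -> float:
    """Fraction of ttbar events with one W -> tau_h nu and one W -> qq'."""
    one_assignment = BR_W_TO_TAU_NU * BR_TAU_TO_HADRONS * BR_W_TO_QQ
    # Assumption: factor 2, since either W boson can give the tau_h.
    return 2.0 * one_assignment


print(f"{tau_h_plus_jets_fraction():.3f}")  # 0.098
```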

The branching fraction of $t\bar{t}$ to $\tau_h$+jets events is expected to be the largest among those
with $\tau$ leptons in the final state. The existence of charged Higgs bosons could give rise to an
enhanced cross section in this channel: the top quark would decay via $t \to H^+ b$, and the
charged Higgs boson would subsequently decay to a $\tau$ lepton via $H^+ \to \tau^+ \nu_\tau$. The present status
of the charged Higgs boson search in $t\bar{t}$ final states with the Compact Muon Solenoid (CMS)
detector is described in Ref. [4].

In this Letter we present the measurement of the $t\bar{t}$ production cross section in the $\tau_h$+jets final
state in proton-proton collisions at $\sqrt{s} = 7$ TeV using data collected by the CMS experiment. It
is the first such measurement performed using the CMS detector and complements the
measurement performed in the lepton + $\tau_h$ channel [5]. The $t\bar{t}$ production cross section in the $\tau_h$+jets
final state has previously been measured in proton-antiproton collisions at $\sqrt{s} = 1.96$ TeV at the
Tevatron [6, 7] and more recently in proton-proton collisions at $\sqrt{s} = 7$ TeV using the ATLAS
detector [8]. All measurements referenced above have been found to be in agreement with the
SM expectations.

2 The CMS detector 

The central feature of the CMS apparatus is a superconducting cylindrical solenoid 6 m in
diameter, which provides an axial magnetic field of 3.8 T. Within the field volume are the silicon
pixel and strip trackers, the crystal electromagnetic calorimeter (ECAL) and the brass/scintillator
hadronic calorimeter (HCAL), which provide identification of charged, electromagnetic and hadronic
particles up to pseudorapidities of $|\eta| < 2.5$ (trackers) and $|\eta| < 3.0$ (calorimeters). The
pseudorapidity is defined as $\eta = -\ln[\tan(\theta/2)]$, where $\theta$ is the polar angle measured with respect
to the positive z axis of the right-handed coordinate system used by the CMS experiment. The x
axis points towards the centre of the LHC ring, the y axis is directed upward along the vertical
and the z axis corresponds to the anticlockwise-beam direction. In addition, the CMS detector
has extensive forward calorimetry. Muons are measured in gas detectors embedded in the
steel return yoke outside the solenoid. The excellent tracker impact parameter resolution of
≈15 μm and transverse momentum ($p_T$) resolution of ≈1.5% for 100 GeV particles support a
robust identification of $\tau_h$ and of jets arising from b-quark hadronisation. A detailed description
of the CMS detector can be found elsewhere [9].

3 Event simulation 

Monte Carlo (MC) simulation is used to determine the signal efficiency as well as the
contribution from electroweak and $t\bar{t}$ background processes (i.e. contributions from the full hadronic,
lepton + jets, lepton + $\tau_h$ and $\tau_h\tau_h$ channels). The $t\bar{t}$ signal and background events as well as
the W/Z + jets events are simulated using the MadGraph (v.5.1.1.0) [10] generator with the
CTEQ6L1 [11] parton distribution functions (PDFs). The simulation of parton showering,
fragmentation, hadronisation and decays of short-lived particles, except $\tau$ leptons, is performed
by PYTHIA (v.6.424) [12]. Tau lepton decays are simulated using TAUOLA (v.2.75) [13]. Single
top-quark events are generated using POWHEG (r1380) [14] interfaced to PYTHIA and TAUOLA.
The top-quark mass is set to 172.5 GeV, and the approximate next-to-next-to-leading-order
(NNLO) $t\bar{t}$ production cross section of 164 ± 10 pb is calculated using the MSTW2008
next-to-next-to-leading-log PDFs [15]. Simulated events are weighted to reflect the number of multiple
interactions (pileup) observed in data. Data-to-simulation b-tagging efficiency scale factors are
applied to correct for the differences between data and simulation.

4 Dataset 

The total integrated luminosity of the dataset analysed is 3.9 fb$^{-1}$. A multijet trigger including
the presence of a hadronically-decaying $\tau$ lepton was designed to record $pp \to t\bar{t} \to \tau_h$ + jets
events. It consists of two consecutively applied filters, referred to as the jet and $\tau$ filters. The
jet filter requires the presence of four central jets reconstructed in the calorimeter ($|\eta| < 2.5$,
$p_T > 40$ GeV), referred to as calorimeter jets. The $\tau$ filter requires the presence of one
isolated particle-flow [16] $\tau$ candidate ($|\eta| < 2.5$, $p_T > 40$ GeV, at least one track with
$p_T > 5$ GeV), matched to one of the four trigger jets. Due to the increasing rate of recorded events
with the rising instantaneous luminosity, the thresholds on the jets and the $\tau$ lepton were raised
to $p_T > 45$ GeV during the later part of the data-taking period. About 80% of the data were
recorded with that more restrictive trigger configuration. The overall $pp \to t\bar{t} \to \tau_h$ + jets
trigger efficiency is small, approximately 1% with respect to all generated $t\bar{t} \to \tau_h$ + jets events,
because of the high $p_T$ threshold on the hadronically-decaying $\tau$ lepton.

5 Event selection 

The object reconstruction relies on the particle-flow technique. The event selection is based on
the presence of at least four particle-flow jets, reconstructed with the anti-$k_T$ clustering
algorithm [17, 18] with a distance parameter of R = 0.5, and on the presence of one particle-flow
$\tau$ candidate reconstructed with the hadron-plus-strip (HPS) identification algorithm [19]. The
HPS algorithm exploits the ability of the particle-flow technique to reconstruct resonances in the
$\tau$ decay. It considers candidates with one or three charged hadrons and up to two neutral pions,
with a charge compatible with ±1e.

The $\tau$ candidates are required to be isolated: the sum of the transverse energies of the
additional charged hadrons and photons ($\tau$ decay products excluded) reconstructed in an isolation
cone of $\Delta R = \sqrt{(\Delta\eta)^2 + (\Delta\phi)^2} = 0.5$ (where $\phi$ is the azimuthal angle in radians) around the
$\tau$ candidate should be less than 1 GeV. The $\tau$ reconstruction efficiency is estimated to be
approximately 44% for genuine $Z \to \tau^+\tau^-$ events, with a misidentification probability for jets
of 0.5% [19].
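
The isolation requirement can be illustrated with a minimal Python sketch. It assumes each particle is given as an ($\eta$, $\phi$, $E_T$) tuple and that the $\tau$ decay products have already been removed from the list; the function names and the input layout are hypothetical and are not taken from the CMS software.

```python
import math


def delta_phi(phi1: float, phi2: float) -> float:
    """Azimuthal difference wrapped into (-pi, pi]."""
    dphi = (phi1 - phi2) % (2.0 * math.pi)
    if dphi > math.pi:
        dphi -= 2.0 * math.pi
    return dphi


def delta_r(eta1: float, phi1: float, eta2: float, phi2: float) -> float:
    """Delta R = sqrt(d_eta^2 + d_phi^2)."""
    return math.hypot(eta1 - eta2, delta_phi(phi1, phi2))


def is_isolated(tau, others, cone=0.5, max_sum_et=1.0):
    """tau: (eta, phi); others: iterable of (eta, phi, et) in GeV.

    'others' holds the additional charged hadrons and photons, with the
    tau decay products already excluded (an assumption of this sketch).
    """
    sum_et = sum(
        et for eta, phi, et in others
        if delta_r(tau[0], tau[1], eta, phi) < cone
    )
    return sum_et < max_sum_et


tau = (0.0, 3.1)
nearby = [(0.1, -3.1, 0.6), (2.0, 0.0, 50.0)]  # second particle is far away
print(is_isolated(tau, nearby))  # True: only 0.6 GeV falls inside the cone
```

The wrap-around in `delta_phi` matters for candidates close to $\phi = \pm\pi$, as in the usage lines above.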

Furthermore, $\tau$ candidates are required to pass discriminators against muons and electrons.
The discrimination against electrons relies on a boosted decision tree that combines variables
that characterise the presence of neutral particles reconstructed in the $\tau$ decay (e.g., number
of constituents, cluster shapes, energy fractions), as well as the presence of a charged hadron
and electromagnetic particles (e.g., energy fractions, electron-pion discriminator). To suppress
the contamination from muons, the candidate is vetoed if its leading track is identified as a
muon in the muon detectors. In addition, a single-charged $\tau_h$ candidate should not be identified
as a minimum ionising particle: the ratio of the sum of the energy deposits in the ECAL and
HCAL calorimeters associated to the $\tau_h$ candidate over the leading track momentum of the $\tau_h$
candidate should be larger than 0.2.

Three jets are required to have $p_T > 45$ GeV and $|\eta| < 2.4$, and the $\tau_h$ candidate is required to
have $p_T > 45$ GeV and $|\eta| < 2.3$. These four objects are explicitly matched to the objects used
by the trigger. The presence of an additional jet with $p_T > 20$ GeV ($\tau_h$ candidate excluded) is
required.

Since two b-jets from the top-quark decays are expected in the final state, at least one jet is
required to be identified as a b-jet using the medium working point of the jet probability
algorithm [20]. At a misidentification probability for light-flavoured jets of 1%, a b-tagging
efficiency of 60% is achieved for this working point of the tagging algorithm.

A veto on the presence of loosely isolated electrons and muons is applied to further prevent the
misidentification of genuine electrons and muons as $\tau_h$ candidates. The isolation requirement is
defined as $I/p_T < 0.15$, where I is the sum of the ECAL and HCAL transverse energy deposits
and of the scalar values of the track momenta in a cone of 0.3 around the lepton
direction, excluding the lepton $p_T$.

The momentum imbalance, $\vec{p}_T^{\,\text{miss}}$, is defined as the opposite of the vectorial sum of the
particle transverse momenta, using all particles reconstructed by the particle-flow algorithm. The
missing transverse energy, $E_T^{\text{miss}}$, is defined as the magnitude of this quantity and is required to
be greater than 20 GeV to reject the multijet background and to achieve a good separation for
the input variables used in the artificial neural network described in Section 6. Events which
pass this set of criteria constitute the preselected sample, from which the yield is extracted.

The trigger efficiencies have been measured in data, determining separately the efficiency of a
single jet and a single $\tau_h$ to pass the trigger requirements. The single-jet efficiency has been
measured in events containing four particle-flow jets in the central region, three of them matched
to the trigger jets. The fourth jet is used as a probe jet and the single-jet efficiency is computed
with respect to its match to the fourth trigger jet. The efficiency of a single particle-flow jet
with $p_T \approx 45$ GeV to pass the single-jet requirement of the trigger is 70 ± 1% (54 ± 1% for the
more restrictive trigger). The jet trigger plateau is reached above ≈120 GeV due to the different
energy scale of particle-flow jets and calorimeter jets.

The $\tau_h$ trigger efficiency has been measured in events that satisfy the jet filter requirement
and that contain a reconstructed $\tau_h$ candidate matched to one of the four trigger jets. The $\tau_h$
trigger plateau is reached for $p_T > 45$ GeV ($p_T > 50$ GeV for the more stringent
trigger), yielding an efficiency of 90 ± 1% (92 ± 1%). The trigger efficiency is modeled in
simulation by multiplying the trigger efficiencies obtained for the three most energetic central jets
and the trigger efficiency obtained for the $\tau_h$ candidate.
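
The product model for the simulated trigger efficiency can be sketched as follows. The measured turn-on curves are not reproduced in the text, so the single-jet efficiency below is a hypothetical two-step stand-in: 70% below the ≈120 GeV plateau (the value quoted near threshold) and an assumed 100% above it. Jets are ranked by $p_T$ as a proxy for "most energetic".

```python
from math import prod


def toy_jet_efficiency(pt: float) -> float:
    """Hypothetical single-jet turn-on, NOT the measured curve."""
    return 1.0 if pt > 120.0 else 0.70


def event_trigger_weight(jet_pts, tau_efficiency, jet_efficiency=toy_jet_efficiency):
    """Product of the tau_h efficiency and the three leading-jet efficiencies."""
    leading_three = sorted(jet_pts, reverse=True)[:3]
    return tau_efficiency * prod(jet_efficiency(pt) for pt in leading_three)


# One jet on the plateau, two near threshold, tau_h on its 90% plateau.
weight = event_trigger_weight([150.0, 60.0, 50.0, 30.0], tau_efficiency=0.90)
print(f"{weight:.3f}")  # 0.441
```

In an analysis the toy function would be replaced by the measured efficiency as a function of the particle-flow jet $p_T$.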

6 Background estimation 

The largest background for this analysis comes from high-multiplicity multijet events where
one of the jets is misidentified as a $\tau_h$; it represents approximately 90% of the expected
background. While control samples in data are used to evaluate the multijet background, the
estimation of the other contributions from $t\bar{t}$ backgrounds and electroweak processes, such as
single top-quark production and W/Z + jets events, relies on MC simulation. Given the low
signal-to-background ratio expected after preselection, an artificial neural network
(ANN) is used to discriminate signal and background.

6.1 Multijet background 

The multijet background is estimated from data by using the same selection as the preselected
sample except that a veto is applied on the presence of a b-tagged jet. From simulated events,
we expect the resulting sample, referred to as the multijet sample, to contain less than 0.6% of
$t\bar{t} \to \tau_h$ + jets events, less than 0.3% of $t\bar{t}$ background events and less than 2.0% of W + jets and
Z + jets events. Therefore the multijet sample provides a good representation of the multijet
background and is used to train the ANN.

To account for the kinematic bias of the b-tag veto in the multijet sample, as the b-tagging
efficiency depends on the jet momenta, the selected multijet events in data are weighted by the
misidentification probability to select at least one b-jet in the event. This assumes that the jets
are predominantly light flavoured:

$P(\text{number of misidentified jets} \geq 1) = \sum_i P_i \cdot \prod_{j \neq i} (1 - P_j)$,

where $P_i$ stands for the misidentification probability of a light-flavoured jet and has
been measured for different $p_T$ and $\eta$ bins in control samples in data [20].
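
The event weight can be sketched in Python under one stated assumption: the printed expression is read here as a sum over jets i of $P_i$ times the product over the other jets of $(1 - P_j)$. That reading is ours. For comparison, the sketch also evaluates the exact probability of at least one misidentified jet, $1 - \prod_j (1 - P_j)$; for per-jet probabilities near 1% the two agree to within a few percent of their value.

```python
from math import prod


def weight_sum_form(mistag_probs):
    """sum_i P_i * prod_{j != i} (1 - P_j): our reading of the printed formula."""
    return sum(
        p_i * prod(1.0 - p_j for j, p_j in enumerate(mistag_probs) if j != i)
        for i, p_i in enumerate(mistag_probs)
    )


def weight_at_least_one(mistag_probs):
    """Exact probability that at least one jet is misidentified."""
    return 1.0 - prod(1.0 - p for p in mistag_probs)


probs = [0.01, 0.01, 0.01, 0.01]  # illustrative per-jet values, not measured ones
print(f"{weight_sum_form(probs):.5f}")      # 0.03881
print(f"{weight_at_least_one(probs):.5f}")  # 0.03940
```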

6.2 Artificial neural network 

The following seven variables are used to build an artificial neural network: the scalar sum
of the transverse momenta of all the selected jets and the $\tau_h$, $H_T$; the aplanarity, A; the charge
multiplied by the absolute value of the pseudorapidity of the $\tau_h$ candidate, $q(\tau_h) \cdot |\eta(\tau_h)|$; the
missing transverse energy, $E_T^{\text{miss}}$; the azimuthal angle between the $\tau_h$ candidate and the missing
transverse energy direction, $\Delta\phi(\tau_h, \vec{p}_T^{\,\text{miss}})$; the invariant mass of the system of all the selected
jets and the $\tau_h$ candidate, $M(\text{jets}, \tau_h)$; and the $\chi^2$ returned by a kinematic fit constraining the
hadronically-decaying W boson and top-quark masses. The aplanarity, $A = \frac{3}{2}\lambda_1$, is used to
describe the spherical topology of the top-quark decay products: $\lambda_1$ is the smallest eigenvalue
of the momentum tensor $M^{\alpha\beta} = \sum_i p_i^\alpha p_i^\beta / \sum_i |\vec{p}_i|^2$, where i runs over the jets and
the $\tau_h$ candidate and $\alpha, \beta = 1, 2, 3$ specify the three spatial components of the momentum. The
$\tau_h$ charge multiplied by the absolute value of the pseudorapidity of the $\tau_h$ candidate, $q(\tau_h) \cdot
|\eta(\tau_h)|$, is used to account for the charge-symmetric nature of $t\bar{t}$ events in contrast to W + jets
events produced in proton-proton collisions. The $\tau_h$ charge is defined as the algebraic sum of
the charges of the charged hadrons selected by the HPS algorithm. The training is performed
using simulated $t\bar{t} \to \tau_h$ + jets events passing the preselected sample criteria and events from
the multijet sample.
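
As an illustration of the aplanarity input, the sketch below builds the normalised momentum tensor and takes 3/2 times its smallest eigenvalue. The factor 3/2 is the standard definition of aplanarity and is assumed here; the closed-form eigenvalue routine for a symmetric 3×3 matrix is a generic textbook method, not the analysis code.

```python
import math


def momentum_tensor(momenta):
    """M^{ab} = sum_i p_i^a p_i^b / sum_i |p_i|^2 for (px, py, pz) tuples."""
    norm = sum(px * px + py * py + pz * pz for px, py, pz in momenta)
    tensor = [[0.0] * 3 for _ in range(3)]
    for p in momenta:
        for a in range(3):
            for b in range(3):
                tensor[a][b] += p[a] * p[b] / norm
    return tensor


def smallest_eigenvalue(m):
    """Smallest eigenvalue of a real symmetric 3x3 matrix (trigonometric form)."""
    off = m[0][1] ** 2 + m[0][2] ** 2 + m[1][2] ** 2
    if off == 0.0:  # already diagonal
        return min(m[0][0], m[1][1], m[2][2])
    q = (m[0][0] + m[1][1] + m[2][2]) / 3.0
    p = math.sqrt((sum((m[k][k] - q) ** 2 for k in range(3)) + 2.0 * off) / 6.0)
    b = [[(m[r][c] - (q if r == c else 0.0)) / p for c in range(3)] for r in range(3)]
    det = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
           - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
           + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    phi = math.acos(max(-1.0, min(1.0, det / 2.0))) / 3.0
    return q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)


def aplanarity(momenta):
    """A = 3/2 * smallest eigenvalue; 0 for planar events, 0.5 for isotropic ones."""
    return 1.5 * smallest_eigenvalue(momentum_tensor(momenta))


isotropic = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]
print(aplanarity(isotropic))  # 0.5
```

Multijet events tend to be more planar than $t\bar{t}$ events, which is why a variable of this kind carries discriminating power.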

6.3 Signal yield extraction 

To minimise the statistical uncertainty of the cross section measurement, we fit the entire
distribution of the ANN output, $D_{NN}$, rather than counting events above a given value. The yield
is extracted via a two-component binned negative log-likelihood fit of the data
to the shapes of the expected signal and multijet background, derived from simulation and the
multijet sample, respectively. The shapes and normalizations of the $t\bar{t}$ background and the
electroweak processes are fixed to the expectation from simulation. Table 1 summarizes the
contribution of the various processes. The number of signal events among the 3050 selected
events is 383 ± 29. The fit uncertainty is given for the number of signal and multijet events,
whereas for the remaining backgrounds the uncertainties are due to the limited size of the
simulated samples.
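
A two-component binned likelihood fit of this kind can be sketched in plain Python. The templates below are invented five-bin shapes, not the analysis distributions, and the maximisation uses simple multiplicative (EM-style) updates as a stand-in for whatever minimiser the analysis used. Only the yields of Table 1 are taken from the text, and they are injected into a noise-free toy dataset to check that the fit returns them.

```python
import math


def poisson_nll(data, expected):
    """Binned Poisson negative log-likelihood, up to data-only constants."""
    return sum(e - d * math.log(e) for d, e in zip(data, expected))


def fit_two_yields(data, signal_shape, multijet_shape, fixed, n_iter=2000):
    """Fit signal and multijet yields; 'fixed' holds per-bin fixed backgrounds.

    Both shapes must be normalised to unit area. Each iteration rescales a
    yield by sum_k d_k * shape_k / mu_k, which equals 1 at the likelihood maximum.
    """
    free_total = sum(data) - sum(fixed)
    n_sig = n_mj = 0.5 * free_total
    for _ in range(n_iter):
        mu = [n_sig * s + n_mj * m + f
              for s, m, f in zip(signal_shape, multijet_shape, fixed)]
        n_sig, n_mj = (
            n_sig * sum(d * s / e for d, s, e in zip(data, signal_shape, mu)),
            n_mj * sum(d * m / e for d, m, e in zip(data, multijet_shape, mu)),
        )
    return n_sig, n_mj


# Hypothetical templates; fixed backgrounds sum to 151 + 62 + 41 + 21 = 275.
signal_shape = [0.02, 0.05, 0.13, 0.30, 0.50]
multijet_shape = [0.50, 0.30, 0.13, 0.05, 0.02]
fixed = [60.0, 60.0, 55.0, 50.0, 50.0]
data = [383.0 * s + 2392.0 * m + f
        for s, m, f in zip(signal_shape, multijet_shape, fixed)]

n_sig, n_mj = fit_two_yields(data, signal_shape, multijet_shape, fixed)
print(round(n_sig), round(n_mj))  # 383 2392
```

Fitting the full shape uses the information in every bin, which is the reason given in the text for preferring it over a single cut on $D_{NN}$.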

Figure 1 shows the fitted ANN output distribution. Figure 2 shows the distribution of M3,
defined as the invariant mass of the three-jet system with the highest transverse momentum, in
a signal-enriched region, $D_{NN} > 0.5$. The selected jets are deemed to originate from the

Table 1: Estimated number of signal and multijet events after a fit to the ANN output
distribution, and expected contributions of the electroweak processes and $t\bar{t}$ backgrounds
from MC simulation.

Source                                  Events
Signal $t\bar{t} \to \tau_h$ + jets     383 ± 29
Multijet                                2392 ± 29
Other $t\bar{t}$                        151 ± 4
W+jets                                  62 ± 8
Single top                              41 ± 1
Z+jets                                  21 ± 2
Total backgrounds                       2667 ± 31
Data                                    3050

Figure 1: Distribution of the artificial neural network output variable after a fit of the signal
and multijet processes to the data. Other background shapes and normalisations are fixed to
the expectations from simulation.



7 Cross section measurement 
7.1 Systematic uncertainties 

The main sources of systematic uncertainty are the uncertainties in the jet energy
scale (JES), the $\tau_h$ energy correction, the $\tau_h$ identification, the trigger efficiency and the
$E_T^{\text{miss}}$ measurement. The uncertainty in the cross section measurement is obtained by combining
the uncertainty in the signal acceptance and in the fitted number of signal events. The systematic
uncertainties in the fitted number of signal events are estimated, when relevant, by iterating
the fit on the ANN output in order to take into account possible shape variations of the ANN
input variables.

[Figure 2 plot: events versus M3 (GeV) for $D_{NN} > 0.5$; CMS, $\sqrt{s} = 7$ TeV, 3.9 fb$^{-1}$. Legend: data,
stat.+syst. uncertainty, $t\bar{t} \to \tau_h$+jets, single top, W/Z + jets, $t\bar{t}$ background, multijet.]

Figure 2: Distribution of the reconstructed M3 variable after a fit of the signal and multijet
processes to the data, after requiring the ANN output value to be greater than 0.5.

The uncertainties in the cross sections for the different simulated background processes are
estimated from theoretical calculations [15, 21]. The uncertainty coming from the top-quark
mass is evaluated using two simulated samples in which the nominal top-quark mass of
172.5 GeV has been shifted by ±6 GeV. Scaling this uncertainty to the measured top-quark
mass uncertainty of 1.1 GeV gives a 3% relative uncertainty in the measured cross section.
The dependence of the selection on the renormalization and factorization scales is estimated by
varying these scales simultaneously by factors of 0.5 and 2.0 from their default value, equal to
the hard-scattering $Q^2$ scale. The resulting relative uncertainty for the $t\bar{t}$ processes is estimated
to be 2%. The influence of the matching thresholds used to associate the matrix elements to the
parton showers is estimated by varying them from 20 GeV to 10 GeV and 40 GeV. The resulting
relative uncertainty for the $t\bar{t}$ processes is estimated to be 3%. The uncertainty in the signal
acceptance due to the choice of PDFs is estimated using the 2 × 11 reference PDFs associated to
CTEQ6L1. The uncertainty in the number of fitted signal events due to the choice of PDFs is
determined by iterating the fit on the ANN output distribution, using simulated events with the
reference PDFs (out of the eleven available) leading to the maximal up (respectively maximal
down) variation.

The uncertainty induced by the statistical uncertainty of the trigger turn-on is computed using
the uncertainties on the trigger turn-on curves versus the transverse momenta of the
particle-flow jets and particle-flow $\tau_h$. An additional ±5% uncertainty is assigned to the $\tau_h$ trigger
efficiency measurement, since the data used to estimate the $\tau$-leg efficiency consist mainly of
jets misidentified as $\tau_h$ candidates.

The pileup uncertainty is estimated by varying the number of pileup interactions measured in 
data according to the theoretical uncertainty of the minimum bias inelastic cross section (±8%). 

The effect of the $\tau_h$ energy correction is estimated by varying the $\tau_h$ energy by ±3% [19]. The
corrections are propagated to the trigger efficiency weights. The uncertainty due to the $\tau_h$
identification efficiency is estimated to be 6% [19].



The uncertainty due to the JES is estimated by rescaling up or down the jet energy by the
uncertainties corresponding to one standard deviation. For the jet energy resolution (JER) the
distribution of the jet energy has been smeared by one standard deviation. The corrections are
propagated to the $E_T^{\text{miss}}$ measurement and to the trigger-efficiency measurement. The energy
of the particles that are not clustered into jets is varied by ±10%, leading to an additional
uncertainty in the $E_T^{\text{miss}}$.

The uncertainty due to applying b-tagging data-to-simulation scale factors for b, c and
light-flavoured jets to the simulated events is estimated by shifting the value of the applied scale
factors by the uncertainty corresponding to one standard deviation [22]. The uncertainty in the
reweighting method applied to the multijet data sample is estimated to be 5%.

The uncertainty in the luminosity measurement is estimated to be 2.2% [23].

Table 2 summarizes the uncertainties entering the cross section measurement, split into
systematic and statistical ones. The statistical uncertainty includes the $D_{NN}$ fit uncertainty, the
statistical uncertainty of the trigger turn-ons, as well as the uncertainty due to the limited size
of the simulated samples.

Table 2: Relative uncertainties in the cross section measurement.

Source                                          Rel. uncert. [%]
W/Z/$t\bar{t}$ backgr. cross section uncert.    ±3
Top-quark mass                                  ±3
Renormalization/factorization scale             ±2
Parton matching                                 ±3
PDF                                             ±5
$\tau_h$ trigger efficiency                     ±7
Pileup                                          +5 / -1
$\tau_h$ energy correction                      ±7
$\tau_h$ identification                         ±9
Jet energy scale                                ±11
Jet energy resolution                           ±2
Unclustered $E_T^{\text{miss}}$                 ±7
b-tagging                                       ±3
Multijet background reweighting                 ±5
Syst. uncertainty                               ±21
Stat. uncert. from fit and MC samples           ±8
Stat. uncert. from trigger                      ±0.4
Total stat. uncert.                             ±8
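
As a cross-check of the total in Table 2, the individual systematic sources can be added in quadrature. The sketch below assumes the sources are uncorrelated and treats the asymmetric pileup entry separately for the upward and downward cases; the text does not state how the quoted ±21% was symmetrised, so that part is an assumption.

```python
import math

# Symmetric relative systematic uncertainties from Table 2, in percent.
SYMMETRIC_SOURCES = {
    "background cross sections": 3.0,
    "top-quark mass": 3.0,
    "renormalization/factorization scale": 2.0,
    "parton matching": 3.0,
    "PDF": 5.0,
    "tau_h trigger efficiency": 7.0,
    "tau_h energy correction": 7.0,
    "tau_h identification": 9.0,
    "jet energy scale": 11.0,
    "jet energy resolution": 2.0,
    "unclustered ETmiss": 7.0,
    "b-tagging": 3.0,
    "multijet background reweighting": 5.0,
}
PILEUP_UP, PILEUP_DOWN = 5.0, 1.0  # asymmetric entry (+5 / -1)


def quadrature_sum(values):
    """Square root of the sum of squares (uncorrelated sources assumed)."""
    return math.sqrt(sum(v * v for v in values))


symmetric = list(SYMMETRIC_SOURCES.values())
total_up = quadrature_sum(symmetric + [PILEUP_UP])
total_down = quadrature_sum(symmetric + [PILEUP_DOWN])
print(f"+{total_up:.1f}% / -{total_down:.1f}%")  # +21.6% / -21.1%
```

Both values are close to the ±21% quoted in the table; the individual entries are themselves rounded, so agreement beyond about a percent is not expected.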



7.2 Measured cross section and branching fraction 

The measurement of the $t\bar{t}$ cross section in the $\tau_h$+jets channel is performed using the following
expression:

$\sigma_{t\bar{t}} = \frac{N - N_B}{A_{\text{tot}} \cdot B \cdot \int \mathcal{L}\,dt}$,

where N is the number of observed candidate events, $N_B$ is the
estimate of the background, $\int \mathcal{L}\,dt$ is the integrated luminosity, $A_{\text{tot}}$ is the total acceptance,
which contains the trigger efficiency and the efficiency of the offline event selection, and B is
the branching fraction of the $\tau_h$+jets channel.

Taking into account the systematic and statistical uncertainties reported in Table 2 and the
evaluated acceptance, $A_{\text{tot}}$ = 0.0066 ± 0.0001 (stat.) ± 0.0010 (syst.), the cross section is:

$\sigma_{t\bar{t}}$ = 152 ± 12 (stat.) ± 32 (syst.) ± 3 (lum.) pb.

Using the number of fitted signal events and the theoretical $t\bar{t}$ production cross section, the
branching fraction of the $\tau_h$ + jets channel is:

B = 0.091 ± 0.009 (stat.) ± 0.019 (syst.) ± 0.002 (lum.).
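
The central values can be reproduced from numbers quoted in this Letter. The sketch below assumes the cross section expression of this section has the form (N − N_B) / (A_tot · B · ∫L dt), takes N − N_B as the 383 fitted signal events, and uses the rounded SM branching fraction of 9.8% from the Introduction. Uncertainties are not propagated.

```python
N_SIGNAL = 383.0           # fitted signal yield, N - N_B
A_TOT = 0.0066             # total acceptance
B_SM = 0.098               # SM branching fraction of the tau_h + jets channel
LUMI_PB = 3.9e3            # 3.9 fb^-1 expressed in pb^-1
SIGMA_THEORY_PB = 164.0    # approximate NNLO prediction


def cross_section_pb(n_signal, acceptance, branching, lumi_pb):
    """sigma = (N - N_B) / (A_tot * B * integrated luminosity)."""
    return n_signal / (acceptance * branching * lumi_pb)


def branching_fraction(n_signal, acceptance, sigma_pb, lumi_pb):
    """Same relation solved for B, with the theoretical cross section as input."""
    return n_signal / (acceptance * sigma_pb * lumi_pb)


sigma = cross_section_pb(N_SIGNAL, A_TOT, B_SM, LUMI_PB)
b_meas = branching_fraction(N_SIGNAL, A_TOT, SIGMA_THEORY_PB, LUMI_PB)
print(f"sigma = {sigma:.0f} pb, B = {b_meas:.3f}")  # sigma = 152 pb, B = 0.091
```

The quoted uncertainties follow the relative values of Table 2: 8% and 21% of 152 pb give about 12 pb and 32 pb, and the 2.2% luminosity uncertainty gives about 3 pb.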

8 Summary 

Top-quark pairs in the $\tau_h$+jets final state have been selected in a data sample from
proton-proton collisions at $\sqrt{s} = 7$ TeV, corresponding to an integrated luminosity of 3.9 fb$^{-1}$. Events
were recorded by a dedicated multijet plus $\tau_h$ trigger and selected by requiring a moderate
amount of $E_T^{\text{miss}}$ and four jets, at least one of which is b-tagged. The multijet background
is discriminated against using an artificial neural network technique. The result, $\sigma_{t\bar{t}}$ =
152 ± 12 (stat.) ± 32 (syst.) ± 3 (lum.) pb, is consistent with CMS measurements performed in
other $t\bar{t}$ final states [5, 24-26], as well as with the theoretical NNLO value of 164 ± 10 pb. The
measured process is the dominant background to a charged Higgs search, where a significant
deviation from the SM expectations would indicate the presence of new phenomena.